Skip to content

[WIP][Java] Use MemorySegment for index serialization#1035

Closed
ldematte wants to merge 14 commits intorapidsai:branch-25.08from
ldematte:java/memorysegment-serialize
Closed

[WIP][Java] Use MemorySegment for index serialization#1035
ldematte wants to merge 14 commits intorapidsai:branch-25.08from
ldematte:java/memorysegment-serialize

Conversation

@ldematte
Copy link
Contributor

NOTE: this WIP/draft uses code from #1033 and #1024 -- don't worry, I'll sort out the merge order of what we want/can merge and update accordingly.

This PR adds a new method to serialize indices (for now, only Cagra -- for demonstration purposes) to memory, instead of relying on a separate temp file. This is cleaner, saves 50% of disk space, and it's up to 10x faster.

The PR includes both the necessary Java changes and the related C API changes -- again, this draft is for discussion and to showcase how this could work; if we like the approach, my plan is to raise separate C and Java PRs (first get the C API change merged, then consume/use it in the Java API).
If we like the approach, a similar change would be needed for deserialization (CagraIndex#build/ cuvsCagraDeserialize).

Benchmarks:

Benchmark                                           (dims)  (size)   Mode  Cnt     Score    Error  Units
CagraSerializationBenchmarks.testSerializeToFile      1024     100  thrpt    5   561.699 ±  3.401  ops/s
CagraSerializationBenchmarks.testSerializeToMemory    1024     100  thrpt    5  6055.371 ± 33.982  ops/s

@ldematte ldematte requested a review from a team as a code owner June 20, 2025 14:42
@copy-pr-bot
Copy link

copy-pr-bot bot commented Jun 20, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the cpp label Jun 20, 2025
@ldematte ldematte changed the title [Java] Use MemorySegment for index serialization [WIP][Java] Use MemorySegment for index serialization Jun 20, 2025
@ldematte
Copy link
Contributor Author

On the C side things are pretty simple: the C++ API already has an overload accepting a std::ostream; the new C API builds a very simple memory-based std::ostream using the provided pointer + size and calls down to the existing C++ API.

I'd like to refine this further; ATM the C API user (including our Java bindings) needs to "guess" a "large enough" buffer to pass to cuvsCagraSerializeToMemory.
From what I see from

void serialize(raft::resources const& res,
and
void serialize(const raft::resources& res,
std::ostream& os,
const strided_dataset<DataT, IdxT>& dataset)
{
auto n_rows = dataset.n_rows();
auto dim = dataset.dim();
auto stride = dataset.stride();
raft::serialize_scalar(res, os, n_rows);
raft::serialize_scalar(res, os, dim);
raft::serialize_scalar(res, os, stride);
// Remove padding before saving the dataset
auto src = dataset.view();
auto dst = raft::make_host_matrix<DataT, IdxT>(n_rows, dim);
RAFT_CUDA_TRY(cudaMemcpy2DAsync(dst.data_handle(),
sizeof(DataT) * dim,
src.data_handle(),
sizeof(DataT) * stride,
sizeof(DataT) * dim,
n_rows,
cudaMemcpyDefault,
raft::resource::get_cuda_stream(res)));
raft::resource::sync_stream(res);
raft::serialize_mdspan(res, os, dst.view());

it is possible to know the total size of the data before even serializing it (and with no need to grab data from the GPU too).

So I was thinking we may add functions to the C and C++ API to get the serialization size. Something like

cuvsCagraIndexSerializationSize(res, index, include_dataset, &length)

Wdyt?

@chatman
Copy link
Contributor

chatman commented Jun 30, 2025

On the C side things are pretty simple: the C++ API already has an overload accepting a std::ostream; the new C API builds a very simple memory-based std::ostream using the provided pointer + size and calls down to the existing C++ API.

FYI, I was trying to chase this exact optimization, by doing a direct serialization. I've just opened a draft PR #1071, where I was passing an existing file, and have cuVS append the index to it. After the file has been written to, checking the file size would give us the offset.

@cjnolet cjnolet added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Jun 30, 2025
@ldematte
Copy link
Contributor Author

ldematte commented Jul 2, 2025

I was passing an existing file, and have cuVS append the index to it

I thought about that, but it looked a bit of a "hack" to me and I preferred this approach. Let me elaborate.
On the C++ side there is a nice API that uses a std::ostream; on the C side it would be difficult to have something so flexible.
To achieve our goals this can be:

  1. a file name + offset + length
  2. a fd + offset + length
  3. a pointer + length (the caller can use a file, via mmap -- we can do that the Java side too)
  4. a function pointer void (*f)(void*, int), that the C API will use any time it needs to write 1 or more bytes.

I think 4. is the closest to a ostream but it's more complex; 1 and 2 are "weird" in a API IMO; 1 would also require re-opening on the C side something that might also already (exclusively) open for write on the Java side (by ES or Lucene); 2 could avoid that but dealing with file descriptors is difficult/impossible to do in Java. Also, 1 and 2 limit the functionality of the API to files (3/4 are more generic).

That's why I landed with 3 for this draft - it was generic enough and easy to implement.
@benfred, wdyt? I'd really like your input here.
I'm open to explore alternatives, including the function pointer one if you think it could make a better C API.

ldematte added 2 commits July 3, 2025 14:54
…segment-serialize

# Conflicts:
#	java/cuvs-java/src/main/java/com/nvidia/cuvs/CagraIndex.java
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/BruteForceIndexImpl.java
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/CagraIndexImpl.java
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/DatasetImpl.java
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/spi/JDKProvider.java
#	java/cuvs-java/src/test/java/com/nvidia/cuvs/CagraBuildAndSearchIT.java
@ldematte
Copy link
Contributor Author

Closed in favour of #1085

@ldematte ldematte closed this Jul 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cpp improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Development

Successfully merging this pull request may close these issues.

3 participants